Abstract —The Rowhammer bug is a reliability issue in DRAM 
cells that can enable an unprivileged adversary to flip the values 
of bits in neighboring rows on the memory module. Previous 
work has exploited this for various types of fault attacks across 
security boundaries, where the attacker flips inaccessible bits, 
often resulting in privilege escalation. It is widely assumed, 
however, that bit flips within the adversary’s own private memory 
have no security implications, as the attacker can already modify 
its private memory via regular write operations. 

We demonstrate that this assumption is incorrect, by employing 
Rowhammer as a read side channel. More specifically, 
we show how an unprivileged attacker can exploit the data 
dependence between Rowhammer induced bit flips and the bits 
in nearby rows to deduce these bits, including values belonging to 
other processes and the kernel. Thus, the primary contribution 
of this work is to show that Rowhammer is a threat to not only 
integrity, but to confidentiality as well. 

Furthermore, in contrast to Rowhammer write side channels, 
which require persistent bit flips, our read channel succeeds even 
when ECC memory detects and corrects every bit flip. Thus, 
we demonstrate the first security implication of successfully- 
corrected bit flips, which were previously considered benign. 

To demonstrate the implications of this read side channel, we 
present an end-to-end attack on OpenSSH 7.9 that extracts an 
RSA-2048 key from the root level SSH daemon. To accomplish 
this, we develop novel techniques for massaging memory from 
user space into an exploitable state, and use the DRAM row-buffer 
timing side channel to locate physically contiguous memory 
necessary for double-sided Rowhammering. Unlike previous 
Rowhammer attacks, our attack does not require the use of 
huge pages, and it works on Ubuntu Linux under its default 
configuration settings. 

Index Terms —Side channels, Rowhammer, OpenSSH 

I. Introduction 

In recent years, the discrepancy between the abstract model 
used to reason about computers and their actual hardware 
implementation has led to a myriad of security issues. These 
range from microarchitectural attacks [15] that exploit contention 
on internal components to leak information such as 
cryptographic keys or keystroke timing [18, 45, 65], through 
transient execution attacks [10, 35, 39, 60, 63] that break down 
fundamental OS isolation guarantees, to memory integrity 
attacks [9, 32, 34, 36] that exploit hardware limitations to 
change the contents of data stored in the device. 

Rowhammer [19, 34, 55] is a fault attack, in which the 
attacker uses a specific sequence of memory accesses that 
results in bit flips, i.e., changes in bit values, in locations 
other than those accessed. Because the attacker does not 
directly access the changed memory location, the change is 
not visible to the processor or the operating system, and is 
not subject to any permission checks. Thus far, this ability to 
reliably flip bits across security boundaries has been exploited 
for sandbox escapes [19, 55], privilege escalation attacks on 
operating systems and hypervisors [19, 21, 51, 55, 61, 64], 
denial-of-service attacks [21, 28], and even for fault injection 
in cryptographic protocols [6]. 

A common theme for all past Rowhammer attacks is 
that they break memory integrity. Namely, the attacker uses 
Rowhammer to obtain a (limited) write primitive into otherwise 
inaccessible memory, and subsequently modifies the 
contents of that memory in a way that aligns with the attacker’s 
goals. This observation has led to various mitigation proposals 
designed to secure the target’s memory by using integrity 
checks [62], or by employing ECC (error-correcting code) 
memory to ensure memory integrity. The latter, in particular, 
has long been touted as a defense against Rowhammer-based 
attacks. Even when an attacker flips a bit in memory, the 
ECC mechanism corrects the error, halting the attack. While 
recent work has demonstrated that an attacker can defeat the 
ECC mechanism, resulting in observable bit-flips after error 
correction [13], successfully corrected flips are still considered 
benign, without any security implications. Thus, in this paper 
we pose the following questions: 

• Is the threat posed by Rowhammer limited only to memory 
integrity and, in particular, can the Rowhammer effect be 
exploited for breaching confidentiality? 

• What are the security implications of corrected bit flips? 
Can an attacker use Rowhammer to breach confidentiality 
even when ECC memory corrects all flipped bits? 

A. Our Contributions 

In this paper, we answer these questions in the affirmative. 
More specifically, we present RAMBleed, a new Rowhammer- 
based attack that breaks memory confidentiality guarantees 
by acquiring secret information from other processes running 
on the same hardware. Remarkably, RAMBleed can break 
memory confidentiality of ECC memory, even if all bit flips are 
successfully corrected by the ECC mechanism. After profiling 
the target’s memory, we show how RAMBleed can leak secrets 
stored within the target’s physical memory, achieving a read 
speed of about 3-4 bits per second. Finally, we demonstrate 
the threat posed by RAMBleed by recovering an RSA 2048-bit 
signing key from an OpenSSH server using only user level 
permissions. 

Data-Dependent Bit Flips. The main observation behind 
RAMBleed is that bit flips depend not only on the bit’s 
orientation, i.e., whether it flips from 1 to 0 or from 0 to 1, but 
also on the values of neighboring bits [34]. Specifically, true 
bits tend to flip from 1 to 0 when the bits above and below 
them are 0, but not when the bits above and below them are 
1. Similarly, anti bits tend to flip from 0 to 1 when the bits 
above and below them are 1, but not when the bits above and 
below them are 0. While this observation dates back to the 
very first Rowhammer paper [34], we show how attackers can 
use it to obtain a read primitive, thereby learning the values 
of nearby bits which they might not be allowed to access. 
RAMBleed Overview. Suppose an attacker wants to 
determine the value of a bit in a victim’s secret. The attacker 
first templates the computer memory to find a flippable bit 
at the same offset in a memory page as the secret bit. (For 
the rest of the discussion we assume a true bit, i.e., one that 
flips from 1 to 0.) The attacker then manipulates the memory 
layout to achieve the arrangement depicted below: 


Row Activation Page | Secret 
Unused              | Sampling Page 
Row Activation Page | Secret 


Here, each memory row spans two memory pages of size 
4KiB. The attacker uses the Row Activation pages for hammering, 
the Sampling page contains the flippable bit, which 
is initialized to 1, and Secret pages contain the secret victim 
data that the attacker aims to learn. If the value of the secret 
bit is 0, the layout results in a flippable 0-1-0 configuration, 
i.e., the flippable bit is set to 1, and the bits directly above 
and below it are 0. Otherwise, the secret bit is 1, resulting in 
a 1-1-1 configuration, which is not flippable. 

Next, the attacker repeatedly accesses the two activation 
pages she controls (left top and bottom rows), thereby hammering 
the middle row. Because the Rowhammer effects are 
data dependent, this hammering induces a bit flip in the 
sampling page in the case that the secret bit is 0. The attacker 
then accesses the sampling page directly, checking for a bit 
flip. If the bit has flipped, the attacker deduces that the value 
of the secret bit is 0. Otherwise, the attacker deduces that 
the value is 1. Repeating the procedure with flippable bits at 
different offsets in the page allows the attacker to recover all 
of the bits of the victim’s secret. 
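
The inference step described above can be sketched as a small simulation. This is an idealized model under stated assumptions, not the attack itself: the functions `hammer` and `read_secret_bit` are hypothetical names introduced for illustration, the cell is assumed to be a true cell, and the flip is assumed to happen deterministically in the 0-1-0 configuration, whereas flips on real hardware are probabilistic.

```python
# Idealized model of the read primitive for true cells.
# All names here are hypothetical; real flips are noisy, not deterministic.

def hammer(above: int, sample: int, below: int) -> int:
    """Return the sampling bit after hammering in the idealized model.

    A true cell holding 1 flips to 0 only in the striped 0-1-0
    configuration; every other configuration is left unchanged.
    """
    if sample == 1 and above == 0 and below == 0:
        return 0
    return sample


def read_secret_bit(secret_bit: int) -> int:
    """Infer a secret bit that sits directly above and below the sampling bit."""
    sample = 1  # the attacker initializes the flippable bit to 1
    after = hammer(secret_bit, sample, secret_bit)
    flipped = after != sample
    return 0 if flipped else 1  # a flip means the secret bit was 0


secret = [1, 0, 0, 1, 1, 0, 1, 0]
recovered = [read_secret_bit(b) for b in secret]
print(recovered == secret)
```

In this model the attacker only ever reads her own sampling bit; the secret enters solely through its effect on whether that bit flips.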

We note here that neither the victim nor the attacker accesses 
the secrets in any way. Instead, by accessing the attacker- 
controlled row activation pages, the attacker uses the victim’s 
data to influence Rowhammer-induced bit flips in her own 
private pages. Finally, the attacker directly checks the sampling 
page for bit flips, thereby deducing the victim’s bits. As such, 
RAMBleed is a cross address space attack. 

ECC Memory. ECC memory has traditionally been considered 
an effective defense against Rowhammer-based attacks. 
Even when an attacker flips a bit in memory, the hardware’s 
ECC mechanisms simply revert any Rowhammer-induced 
bit flips. However, recent work has demonstrated 
that an attacker can defeat the ECC mechanism by inducing 
enough carefully-placed flips in a single codeword, resulting 
in observable bit-flips after error correction [13]. 

In this paper, however, we show that even ECC-corrected 
bit flips may have security implications. This is because 
RAMBleed does not necessarily require the attacker to read 
the bit to determine if it has flipped. Instead, all the attacker 
requires for mounting RAMBleed is an indication that a bit in 
the sampling page has flipped (and was subsequently corrected). 
Unfortunately, as Cojocar et al. [13] show, the synchronous 
nature of the ECC correction algorithm typically exposes such 
information through a timing channel, where memory accesses 
that require error correction are measurably slower than normal 
accesses. 

Thus, we can exploit Rowhammer-induced timing variation 
to read data even from ECC memory. In particular, our work is 
the first to highlight the security implications of successfully 
corrected flips, hitherto considered to be benign. 

Memory Massaging. One of the main challenges for mounting 
RAMBleed, and Rowhammer-based attacks in general, is 
achieving the required data layout in memory. Past approaches 
rely on one or more mechanisms which we now describe. The 
first practical Rowhammer attack relied on operating system 
interfaces (e.g., /proc/pid/pagemap in Linux) to perform 
virtual-to-physical address translation for user processes [55]. 
Later attacks leveraged huge pages, which give access to 
large chunks of consecutive physical memory [19], thereby 
providing sufficient information about the physical addresses 
to mount an attack. Other attacks utilized memory grooming or 
massaging techniques [61], which prepare memory allocators 
such that the target page is placed at the attacker-chosen physical 
memory location with a high probability. An alternative 
approach is exploiting memory deduplication [7, 51], which 
merges physical pages with the same contents. The attacker 
then hammers its shared read-only page, which is mapped to 
the same physical memory location as the target page. 

However, many of these mechanisms are no longer available 
for security reasons [42, 52, 57, 61]. Thus, as a secondary 
contribution of this paper, we present a new approach for 
massaging memory to achieve the desired placement. Our 
approach builds on past works that exploit the Linux buddy 
allocator to allocate blocks of consecutive physical memory 
[11, 61]. We extend these works by demonstrating how 
an attacker can acquire some physical address bits from the 
allocated memory. We further show how to place secret- 
containing pages at desired locations in the physical memory. 

Finally, we note that this method may have independent 
value for mounting Prime+Probe last-level cache attacks [40]. 
This is because it allows the attacker to deduce physical addresses 
of memory regions, thereby aiding eviction set construction. 
Extracting Cryptographic Keys. To demonstrate the 
effectiveness of RAMBleed, we use it to leak secrets across 
process boundaries. Specifically, we use RAMBleed against 
an OpenSSH 7.9 server (newest version at time of writing), 
and successfully read the bits of an RSA-2048 key 
at a rate of 0.3 bits per second, with 82% accuracy. We 
combine the attack with a variant of the Heninger-Shacham 
algorithm [23, 24, 46] designed to recover RSA keys from 
partial information, achieving complete key recovery. 
Summary of Contributions. In this paper we make the 
following contributions: 

• We demonstrate the first Rowhammer attack that breaches 
confidentiality, rather than integrity (Section IV). 

• We abuse the Linux buddy allocator to allocate a large block 
of consecutive physical addresses, and show how to recover 
some of the physical address bits (Section V-A). 

• We design a new mechanism, which we call Frame Feng 
Shui, for placing victim program pages at a desired location 
in the physical memory (Section V-C). 

• We demonstrate a Rowhammer-based attack that leaks keys 
from OpenSSH while only flipping bits in memory locations 
the attacker is allowed to modify (Section VII). 

• Finally, we demonstrate RAMBleed against ECC memory, 
highlighting security implications of successfully-corrected 
Rowhammer-induced bit flips (Section VIII). 

B. Responsible Disclosure 

Following the practice of responsible disclosure, we have 
notified Intel, AMD, OpenSSH, Microsoft, Apple, and Red 
Hat about our findings. The results contained in this paper (and 
in particular our memory massaging technique) were assigned 
CVE-2019-0174 by Intel. 

C. Related Works 

Security Implications of Rowhammer. The potential for 
sporadic bit flips was well known in the DRAM manufacturing 
industry, but was considered a reliability issue rather than a 
security threat. Kim et al. [34] were the first to demonstrate a 
reliable method for inducing bit flips by repeatedly accessing 
pairs of rows in the same bank. Subsequently, Seaborn and 
Dullien [55] showed that Rowhammer is a security concern 
by using Rowhammer-induced flips to break out of Chrome’s 
Native Client sandbox [67] and obtain root privileges. 

Since the initial Rowhammer-based exploit of [55], researchers 
have demonstrated numerous other avenues for 
Rowhammer exploitation. Gruss et al. [19] demonstrated 
that page-table bits can be flipped via Rowhammer from 
JavaScript, while Bosman et al. [7] flipped the types of 
JavaScript objects through the browser. Aweke et al. [2] also 
demonstrated Rowhammer flips without the use of CLFLUSH, 
and with a halved DRAM refresh interval. Van Der Veen 
et al. [61] used Rowhammer to gain root on mobile phones, 
while Lipp et al. [38] and Tatar et al. [59] used network 
requests to induce Rowhammer flips via a completely remote 
attack. Frigo et al. [14] managed to induce bit flips from the 
browser’s interface to the GPU. ECC memory was shown to 
be vulnerable to Rowhammer by Cojocar et al. [13]. 

Lou et al. [41] systematically categorize Rowhammer attacks 
in a framework to better understand the problem and 
uncover new types of Rowhammer attacks. Their methodology, 
however, is limited and completely ignores the possibility of 
using Rowhammer as a read side channel. 

Defenses. Various defenses have been proposed for Rowhammer 
attacks, aiming to detect ongoing attacks [2, 12, 20, 27, 
47, 69], neutralize the effect of bits being flipped [19, 61], or 
eliminate the possibility of Rowhammer bit flips in the first 
place [8, 31, 33, 34]. 

II. Background 

This section provides the necessary background on DRAM 
architecture, the row-buffer timing side channel described by 
Pessl et al. [49], and the Rowhammer bug. We begin by briefly 
overviewing DRAM organization and hierarchy. 

A. DRAM Organization 

DRAM Hierarchy. DRAM (dynamic random access memory) 
is organized in a hierarchy of cells, banks, ranks, and 
DIMMs, which are connected to one or more channels. 

More specifically, at the lowest level DRAM stores bits in 
units called cells, each consisting of a capacitor paired with 
a transistor. The charge on the capacitor determines the value 
of the bit stored in the cell, while the transistor is used to 
access the stored value. For true cells, a fully charged capacitor 
represents a ‘1’ and a discharged capacitor represents a ‘0’, 
while the opposite holds true for anti cells. 

Memory cells are arranged in a grid of rows and columns 
called a bank. Cells in each row are connected via a word 
line, while cells in each column are connected across bit lines. 
Banks are then grouped together to form a rank, which often 
corresponds to one side of a DIMM. Each DIMM is inserted, 
possibly with other DIMMs, into a single channel, which is a 
physical connection to the CPU’s memory controller. 

DRAM Operation. Access to a DRAM bank operates at 
a resolution of a row, typically consisting of 65536 cells, or 
8KiB. To activate a row, the memory controller raises the 
word line for the row. This produces minute currents on the 
bit lines, which depend on the charge in the cells of the active 
row. Sense amplifiers capture these currents at each column 
and amplify the signal to both copy the logical value of the 
cell into a latch and refresh the charge in the active row. Data 
can then be transferred between the CPU and the row buffer, 
which consists of the latches that store the values of the cells 
in the active row. 

Over time, the charge in the cell capacitors in DRAM leaks. 
To prevent data loss through leakage, the charges need to be 
refreshed periodically. Refreshing is handled by the memory 
controller, which ensures that each row is opened at least once 
every refresh interval, which is generally 64 ms [30] for 
DDR3 and DDR4. LPDDR4 defines temperature-dependent 
adaptations for the refresh interval [29]. 

DRAM Addressing. Modern memory controllers use a complex 
function to map a physical address to the correct physical 
location in memory (i.e., to a specific channel, DIMM, rank, 
bank, row, and column). While these functions are proprietary 
and undocumented for Intel processors, they can be reverse 
engineered through both software- and hardware-based techniques 
[49]. For example, Figure 1 shows the DRAM 
mapping for a typical configuration found in Ivy Bridge and 
Haswell systems. As the figure shows, the bank and the rank 
are determined based on bits 13-21 of the physical address. We 
have verified that the mapping matches the Haswell processor 
we use in our experiments. 

Fig. 1: Reverse engineered DDR3 single channel mapping (2 
DIMM per channel) for Ivy Bridge / Haswell (from [49]). 

Fig. 2: Different hammering techniques as presented by [21]. 

Row Addressing. As discussed above, DRAM rows have a 
fixed size of typically 8KiB. However, from the implementa¬ 
tion side, it is usually more important to know what amount of 
memory has the same row index. This is sometimes referred to 
as same-row [19, 55]. If the address goes to the same row and 
the same bank, it is called same-row same-bank; if it goes 
to different banks but has the same row index, it is called 
same-row different-bank [55]. 

In our experimental setup, we have a total of 32 DRAM 
banks, and thus an aligned block of 256 KiB = $2^{18}$ B of 
memory has the same row index. In other words, the row 
index on our system is directly determined by bits 18 and 
above of the physical address. Pessl et al. [49] provide a more 
extensive discussion. 
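
The arithmetic behind the row index can be sketched as follows. This is an illustration under the figures quoted above (8 KiB rows, 32 banks); the helper `row_index` is a hypothetical name, and the sketch deliberately does not model how the channel, rank, and bank are selected from the lower address bits.

```python
# Row-index arithmetic for the setup described above.
# Assumption: 8 KiB per row and 32 banks, so one row index covers
# 32 * 8 KiB = 256 KiB of physical address space.

ROW_SIZE = 8 * 1024                          # bytes in one DRAM row
NUM_BANKS = 32                               # banks in the experimental setup
SAME_ROW_SPAN = ROW_SIZE * NUM_BANKS         # bytes sharing one row index
ROW_SHIFT = SAME_ROW_SPAN.bit_length() - 1   # log2 of the span


def row_index(phys_addr: int) -> int:
    """Row index of a physical address (bits ROW_SHIFT and above)."""
    return phys_addr >> ROW_SHIFT


print(SAME_ROW_SPAN, ROW_SHIFT)
print(row_index(0x3FFFF), row_index(0x40000))
```

Two addresses inside the same aligned 256 KiB block therefore share a row index, while the next block starts the following row.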

B. Row-Buffer Timing Side Channel 

Opening a row and loading its contents into the row buffer 
results in a measurable latency. Even more so, repeatedly 
alternating accesses to two uncached memory locations will 
be significantly slower if these two memory locations happen 
to be mapped to different rows of the same bank [49]. In 
Section V, we use this timing difference to identify virtual 
addresses whose contents lie within the same bank, and 
also uncover the lower 22 physical addressing bits, thereby 
enabling double-sided Rowhammer attacks. 

C. Rowhammer 

The trend towards increasing DRAM cell density and decreasing 
capacitor size over the past decades has given rise to 
a reliability issue known as Rowhammer. Specifically, repeated 
accesses to rows in DRAM can lead to bit flips in neighboring 
rows (not only the direct neighbors), even if these neighboring 
rows are not accessed [34]. 

The Root Cause of Rowhammer. Due to the proximity of 
word lines in DRAMs, when a word line is activated, crosstalk 
effects on neighboring rows result in partial activation, which 
leads to increased charge leakage from cells in neighboring 
rows. Consequently, when a row is repeatedly opened, some 
cells lose enough charge before being refreshed to drop to an 
uncharged state, resulting in bit flips in memory. 

Performing Uncached Memory Accesses. A central requirement 
for triggering Rowhammer bit flips is the capability 
to make the memory controller open and close DRAM rows 
rapidly. For this, the adversary needs to generate a sequence 
of memory accesses to alternating DRAM rows that bypass 
the CPU cache. Several approaches have been suggested for 
bypassing the cache. 

• Manually Flush Cache Lines. The x86 instruction 
set provides the CLFLUSH instruction, which flushes the 
cache line containing its destination address from all of 
the levels of the cache hierarchy. Crucially, CLFLUSH only 
requires read access to the flushed address, facilitating 
Rowhammer attacks from unprivileged user-level code. On 
ARM platforms, prior to ARMv8, the equivalent cache line 
flush instruction could only be executed in kernel mode; 
ARMv8 does, however, offer operating systems the option 
to enable an unprivileged cache line flush operation. 

• Cache Eviction. In cases where the CLFLUSH instruction 
is not available (e.g. in the browser), an attacker can force 
contention on cache sets to cause cache eviction [2, 19]. 

• Uncached DMA Memory. Van Der Veen et al. [61] report 
that the cache eviction method above is not fast enough to 
cause bit flips on contemporary ARM-based smartphones. 
Instead, they used the Android ION feature to allocate 
uncacheable memory to unprivileged userspace applications. 

• Non-temporal instructions. Non-temporal load and store 
instructions direct the CPU not to cache their results. Avoiding 
caching means that subsequent accesses to the same 
address bypass the cache and are served from memory [50]. 

Another important distinction between Rowhammer attacks is 
the strategy in which DRAM rows are activated, i.e., how 
aggressor rows are selected. See Figure 2. 

Double-sided Rowhammer. The highest number of 
Rowhammer-induced bit flips occurs when the attacker hammers, 
that is, repeatedly opens and closes, the two rows 
adjacent to a target row. This approach maximizes the number 
of neighboring row activations, and consequently the charge 
leakage from the target row (Figure 2a). However, for double-sided 
hammering, the attacker needs to locate addresses in the 
two adjacent rows, which may be difficult without knowledge 
of the physical addresses and their mapping to rows. Previous 
attacks exploited the Linux pagemap interface, which maps 
virtual to physical addresses. However, to mitigate the Seaborn 
and Dullien [55] attack, recent versions of Linux only allow 
root access to the pagemap interface. 

Another avenue used by previous works for finding adjacent 
rows is to use huge pages, e.g., transparent huge pages (THP), 
to obtain large blocks of physically contiguous memory [19]. 
Single-sided Rowhammer. To avoid the need for finding 
the two rows adjacent to the target row, an adversary can take 
a more opportunistic approach, which aims to cause bit flips 
in any row in memory (Figure 2b). This can be achieved by 
guessing several addresses at random, e.g., 8 addresses, in the 
hope that some fall within two rows in the same bank. With 
B banks, the probability of having at least one such pair is 
$1 - \prod_{i=1}^{7} \frac{B-i}{B} \approx 61.4\%$ for 8 addresses and $B = 32$ banks. 
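
The figure quoted above is a birthday-style bound and can be checked numerically. The sketch below assumes that each guessed address lands in a uniformly random bank, independently of the others; the function name `same_bank_pair_probability` is introduced here for illustration only.

```python
from fractions import Fraction


def same_bank_pair_probability(num_addresses: int, num_banks: int) -> float:
    """Probability that at least two addresses share a bank.

    Assumes every address falls into a uniformly random bank,
    independently of the others (a simplifying assumption).
    """
    all_distinct = Fraction(1)
    for i in range(1, num_addresses):
        # The (i+1)-th address must avoid the i banks already taken.
        all_distinct *= Fraction(num_banks - i, num_banks)
    return float(1 - all_distinct)


p = same_bank_pair_probability(8, 32)
print(f"{p:.1%}")
```

With 8 addresses and 32 banks this evaluates to roughly 0.614, matching the value in the text.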

Alternatively, the adversary can take a more disciplined 
approach and use the row-buffer timing channel (Section II-B) 
to identify rows in the same bank [6, 61]. 

Because only one of the rows being hammered is located 
near the target row, single-sided Rowhammer results in fewer 
bit flips than double-sided Rowhammer [2]. 

One-location Rowhammer. Finally, one-location hammering 
[21] is the least restrictive strategy, but also generates 
the fewest bit flips (Figure 2c). Here, the attacker 
repeatedly flushes and then reads from a single row. The 
presumed cause of flips, in this case, is that newer memory 
controller policies automatically close DRAM rows after a 
small amount of time. This obviates the need to open different 
rows in the same bank. 

D. RSA Background 

As the end-to-end attack described in this paper recovers 
RSA private keys from an OpenSSH server, we now briefly 
overview the RSA [53] cryptosystem and signature scheme. 

A user creates an RSA key pair by first generating two 
random primes, $p$ and $q$, a public exponent $e$, and a private 
exponent $d$ such that $e \cdot d \equiv 1 \pmod{(p-1)(q-1)}$. The 
public key is then set to be $(e, N)$ where $N = pq$, and the 
private key is set to be $(d, N)$. To sign a message $m$, the signer 
uses its private key to compute $\sigma \leftarrow z^d \bmod N$, where $z$ is a 
collision-resistant hash of $m$. To verify a signature $\sigma$, the 
verifier first hashes the message herself and obtains a digest 
$z'$. She then computes $z'' \leftarrow \sigma^e \bmod N$ using the public key 
and verifies that $z' = z''$, rejecting the signature otherwise. 
The Chinese Remainder Theorem. A common optimization 
used by most applications to compute $\sigma \leftarrow z^d \bmod N$ is the 
Chinese Remainder Theorem (CRT). Here, the private key 
is first augmented with $d_p \leftarrow d \bmod (p-1)$ and 
$d_q \leftarrow d \bmod (q-1)$. Next, instead of computing $z^d \bmod N$ directly, 
the signer computes $\sigma_p \leftarrow z^{d_p} \bmod p$ and $\sigma_q \leftarrow z^{d_q} \bmod q$. 
Finally, the signer computes $\sigma$ from $\sigma_p$ and $\sigma_q$ using the CRT. 
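
As a worked illustration of the CRT optimization, the following sketch signs a stand-in digest with toy parameters and checks the result against the direct computation. The primes, the exponent, and the digest value are arbitrary choices made for this example, the recombination uses Garner's formula (one common way to apply the CRT), and no hashing or padding is performed, so this is not a usable signature scheme.

```python
# Toy RSA-CRT signing; all parameters are illustrative assumptions.
# Requires Python 3.8+ for pow(x, -1, m) (modular inverse).

p, q = 61, 53                 # toy primes, far too small for real use
N = p * q
e = 17                        # public exponent, coprime to (p-1)(q-1)
d = pow(e, -1, (p - 1) * (q - 1))

# CRT components that augment the private key.
d_p = d % (p - 1)
d_q = d % (q - 1)
q_inv = pow(q, -1, p)         # q^{-1} mod p, used for recombination


def sign_crt(z: int) -> int:
    """Compute z^d mod N from the two half-size exponentiations."""
    sigma_p = pow(z, d_p, p)
    sigma_q = pow(z, d_q, q)
    # Garner's recombination: the result is sigma_p mod p and sigma_q mod q.
    h = (q_inv * (sigma_p - sigma_q)) % p
    return sigma_q + h * q


z = 1234                      # stand-in for the digest of a message
sigma = sign_crt(z)
print(sigma == pow(z, d, N))  # agrees with the direct computation
print(pow(sigma, e, N) == z)  # verifies under the public key
```

The sketch also shows why the CRT values are sensitive: $d_p$ and $d_q$ are stored alongside $d$ and are part of the private key material.
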
Partial Key Recovery. Cryptographic keys recovered through 
a side channel are typically subject to some measure of noise. 
Often only a fraction of the key bits are recovered, and their 
values are not known with certainty. Various researchers have 
exploited the redundancy present in private key material to 
correct the errors [5, 25, 44, 46, 48, 66]. 


III. Threat Model 

We assume an attacker that runs unprivileged software 
within the same operating system (OS) as the victim software. 
The OS maintains isolation between the victim program and 
the attacker. In particular, we assume that the OS works 
correctly. We further assume that the attacker cannot exploit 
microarchitectural side channel leakage from the victim, either 
because the victim does not leak over microarchitectural 
channels or because the OS enforces time isolation [16]. We 
do assume that the machine is vulnerable to the Rowhammer 
attack. However, we assume that the attacker only changes 
its own private memory to bypass any countermeasures and 
detection mechanisms. Finally, we assume that the attacker is 
able to somehow trigger the victim to perform allocations of 
secret data (for example using incoming SSH connections 
for the OpenSSH attack in Section VII). 

IV. RAMBleed 

Previous research mostly considers Rowhammer as a threat 
to data integrity, allowing an unprivileged attacker to modify 
data without accessing it. With RAMBleed, however, we 
show that Rowhammer effects also have implications on data 
confidentiality, allowing an unprivileged attacker to leverage 
Rowhammer-induced bit flips in order to read the value of 
neighboring bits. Furthermore, as not every bit in DRAM can 
be flipped via Rowhammer, we also present novel memory 
massaging techniques that aim to locate and subsequently 
exploit Rowhammer flippable bits. This enables the attacker to 
read otherwise inaccessible information such as secret key bits. 
Finally, as our techniques only require the attacker to allocate 
and deallocate memory and to measure instruction timings, 
RAMBleed allows an unprivileged attacker to read secret data 
using the default configuration of many systems (e.g., Ubuntu 
Linux), without requiring any special configurations (e.g., 
access to pagemap, huge pages, or memory deduplication). 

A. The Root Cause of RAMBleed. 

RAMBleed exploits a physical phenomenon in DRAM 
DIMMs wherein the likelihood of a Rowhammer induced 
bit flip depends on the values of the bits immediately above 
and below it. Bits only flip when the bits both immediately 
above and below them are in their discharged state [13]. 
This is in agreement with observations by Kim et al. [34] 
that hammering with a striped pattern, where rows alternate 
between all zeros and all ones, generates many more flips than 
with a uniform pattern. 

Data-Dependent Bit Flips. Put simply, bits tend to flip to 
the same value as the bits in the adjacent rows. That is, a 
charged cell is most likely to flip when it is surrounded by 
uncharged cells. This is likely due to capacitors of opposite 
charges inducing parasitic currents in one another, which cause 
the capacitors to leak charge more quickly [3]. For our attack 
to work, it is also crucial that bit flips are influenced only by 
bits in the same column, and not by the neighboring bits within 
the same row. This isolation is what allows us to deduce one 
bit at a time. Cojocar et al. [13] experimentally demonstrate 
this to be the case. 

A Toy Example. To illustrate with a concrete example, we 
introduce the notation of an x-y-z configuration to describe the 
situation in which three adjacent bits in the same column have 
the values x, y, and z, respectively, where $x, y, z \in \{0,1\}$. The 
key reasoning behind our attack is as follows. 

• True Cells. For cells where a one-valued bit is represented 
as the cell being charged, the 0-1-0 configuration is the 
most likely to flip, changing to an all zero configuration (0-0-0) 
when rows of the first and the last zero-valued cells 
are hammered. In this case, the surrounding zero-bits in the 
aggressor rows enable the bit flip in the victim row. 

• Anti Cells. For cells where a one-valued bit is represented 
by an uncharged cell, a 1-0-1 configuration is more likely 
to flip and change to an all one configuration (1-1-1) when 
rows of the first and the last one-valued cells are hammered. 

Notation. We adopt Cojocar et al.’s [13] terminology of 
calling 0-1-0 and 1-0-1 configurations “stripe” patterns, and 
naming 1-1-1 and 0-0-0 configurations “uniform” patterns. 
Given this data dependency, we now proceed to build a read 
side channel in which we read the bits in surrounding rows 
by observing flips, or lack thereof, in the attacker’s row. 

B. Memory Scrambling 

One potential obstacle to building our read channel is 
that modern memory controllers employ memory scrambling, 
which is designed to avoid circuit damage due to resonant 
frequency [68] as well as to serve as a mitigation to cold- 
boot attacks [22]. Memory scrambling applies a weak stream 
cipher to the data prior to sending it to the DRAM. That is, 
the memory scrambler XORs the data with the output of a 
pseudo-random number generator (PRNG). The seed for the 
PRNG depends on the physical address of the data and on a 
random number generated at boot time [26, 43]. The PRNG 
is cryptographically weak, and given access to the physical 
data in the DRAMs, an adversary can reverse engineer it and 
recover the contents of the memory [4, 68]. 

Bypassing Memory Scrambling. Under our threat model 
we cannot use the techniques of Yitbarek et al. [68], as we do 
not assume physical access. However, we can take advantage 
of the weaknesses of the PRNG. In particular, the boot-time 
random seed is identical for all rows, and the physical address 
bits included in the seed are such that several adjacent rows 
can have the same bits in their addresses. Thus, adjacent 
rows typically use the same seed, and have the same mask 
applied. Applying the same mask across multiple rows means 
that adjacent bits either remain unchanged or are all inverted. 
Either way, as observed by [13], striped configurations remain 
striped after scrambling. Hence, writing a striped configuration 
to memory results in a striped configuration appearing in the 
DIMM, maintaining the crucial property that a bit will only 
flip if the bits immediately above and below have the opposite 
value. 
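The claim that a shared mask preserves striped and uniform patterns can be checked exhaustively. The Python sketch below is only an illustration under a simplifying assumption: a single mask bit is shared by three vertically adjacent cells. The helper names (classify, scramble, mask_preserves_patterns) are ours and do not correspond to any memory controller interface.

```python
from itertools import product

def classify(triple):
    """Label a vertical (above, centre, below) bit triple."""
    above, centre, below = triple
    if above == centre == below:
        return "uniform"
    if above == below and centre != above:
        return "striped"
    return "mixed"

def scramble(triple, mask_bit):
    """XOR the same mask bit into all three cells (shared-seed assumption)."""
    return tuple(bit ^ mask_bit for bit in triple)

def mask_preserves_patterns():
    """True if a shared mask never changes the pattern class of any triple."""
    return all(
        classify(scramble(t, m)) == classify(t)
        for t in product((0, 1), repeat=3)
        for m in (0, 1)
    )

print(mask_preserves_patterns())
```

Because XOR with a common bit either leaves all three cells unchanged or inverts all of them, equality relations between the cells are unaffected, which is why the check succeeds for every triple.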


C. Exploiting Data-Dependent Bit Flips 

We now show how to exploit the data-dependent bit flips 
presented above to read data without accessing it. 

A Leaky Memory Layout. We begin by considering the 
memory layout presented in Figure 3a, where every DRAM 
row contains two 4 KiB pages. In this layout, we assume that 
A0, A1, and A2 are the attacker-controlled pages containing 
known data, S is a page with the victim’s secret, and R0 is 
an arbitrary page. All three rows reside in the same bank. 
Next, note that attacker pages A0 and A2 reside in the same 
rows as the copies of S. Since DRAM row-buffers operate 
at an 8 KiB granularity, accessing a value in A0 activates the 
entire first row, including the page containing the secret S. 
Similarly, accessing a value in A2 activates the entire third row, 
again including the page that contains S. Thus, by repeatedly 
accessing A0 and A2, the attacker can indirectly use the victim 
pages containing S for hammering, despite not having any 
permissions to access them. 

Hammering. By hammering the attacker-controlled pages A0 
and A2, the attacker induces analog disturbance and interaction 
between S and A1. Page A1 also belongs to the attacker, who 
can therefore detect bit flips in it. From these bit flips, the 
attacker can infer the values of bits in S. 

Reading Secret Bit Values. Given a page P, we denote 
by P[i] the i-th bit in P, where $i \in \{0, 1, \dots, 32766, 32767\}$. 
At a high level, given a known flippable bit A1[i] in the page 
A1, we can read the corresponding bit S[i] (i.e., the bit at the 
same offset within the frame) in S as follows: 

1) Initialize. Assuming that the bits are true cells, the attacker 
first populates all of A1 with ones before hammering. 

2) Hammer. The attacker repeatedly reads her own pages 
A0 and A2, thereby using the victim’s secret-containing 
pages to perform double-sided hammering on A1. 

3) Observe. After hammering, the attacker reads the value of 
the bit A1[i], which is accessible to her because the page 
A1 is located inside the attacker’s own private memory 
space. We argue that after hammering, the value of A1[i] 
is equal to the value of S[i]. Indeed, if S[i] equals 0, then 
before hammering A1[i] would have been in the center of a 
0-1-0 stripe configuration. Since A1[i] sits in the center of 
a flippable stripe configuration, A1[i] will flip from one to 
zero after hammering. Conversely, if S[i] equals 1, then 
A1[i] will be in the center of a 1-1-1 uniform configuration, 
and will retain its value of 1 after hammering. Thus, in 
both cases, the attacker reads A1[i] from her own private 
memory after hammering, which directly reveals S[i]. 

Double-sided RAMBleed. In the case of anti-cells, the 
only change we make is that in step 1, we populate A1 with 
zeros instead of ones. Thus, by observing bit flips in her own 
pages, the attacker can deduce the values of surrounding cells. 
Since the secret S surrounds A1 from both sides, we call this 
“double-sided RAMBleed”. 
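The initialize, hammer, and observe steps can be traced with a small model. The following Python sketch assumes an idealised, fully deterministic cell (a sampled bit flips exactly when it holds the charged value and both vertical neighbours hold the opposite value); real DIMMs behave probabilistically, and the function names are illustrative only.

```python
def hammer(above, centre, below, true_cell=True):
    """Idealised double-sided hammering of one vertical bit triple.

    For a true cell the charged value is 1, for an anti cell it is 0.
    The centre bit discharges only in a striped configuration.
    """
    charged_value = 1 if true_cell else 0
    if centre == charged_value and above != charged_value and below != charged_value:
        return 1 - centre
    return centre

def read_secret_bit(secret_bit, true_cell=True):
    """Recover S[i] by sampling A1[i] between two copies of S."""
    a1_bit = 1 if true_cell else 0                             # 1) Initialize
    a1_bit = hammer(secret_bit, a1_bit, secret_bit, true_cell)  # 2) Hammer
    return a1_bit                                              # 3) Observe

for cell_type in (True, False):
    print(cell_type, [read_secret_bit(s, cell_type) for s in (0, 1)])
```

In this model the observed value of A1[i] equals S[i] for both cell types, which matches the argument given in step 3 and in the anti-cell variant.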

Single-sided RAMBleed. Figure 3b presents the memory 
layout for what we call “single-sided RAMBleed”, which 
differs from the double-sided case only in the bottom right 
frame; instead of another copy of S, an arbitrary page R1 
resides below A1. With this configuration, we can still read 
out bits of S by following the same steps as in the double-sided 
scenario, albeit with reduced accuracy. The reduction 
in accuracy is because the value of R1[i] may differ from 
that of S[i]. Assuming a uniform distribution of bits in R1, 
in half of the cases, the starting configuration is one of 
1-1-0 and 0-1-1, which are neither striped nor uniform. With 
such configurations, bits tend to flip less than with striped 
configurations, introducing uncertainty to the read values. Yet, 
in half of the cases R1[i]=S[i], resulting in the same outcome 
as for the double-sided RAMBleed scenario. 

(a) Double-sided RAMBleed. Here, the sampling page (A1) is 
sandwiched between two copies of S. 

    Row Activation Page (A0)  |  Secret (S) 
    Unused (R0)               |  Sampling Page (A1) 
    Row Activation Page (A2)  |  Secret (S) 

(b) Single-sided RAMBleed. Here, the sampling page (A1) is 
neighbored by the secret-containing page (S) on a single side. 

    Row Activation Page (A0)  |  Secret (S) 
    Unused (R0)               |  Sampling Page (A1) 
    Row Activation Page (A2)  |  Unused (R1) 

Fig. 3: Page layout for reading out the victim’s secret. Each cell represents a 4 KiB page, meaning that each row represents 
an 8 KiB row in a DRAM bank. The attacker repeatedly accesses her row activation pages A0 and A2, activating the top and 
bottom rows. She then reads out corresponding bits in page S by observing bit flips in the sampling page A1. 

While double-sided RAMBleed maximizes the disturbance 
interactions between the secret bits and A1, it is also more 
challenging to execute in practice because it requires two 
copies of the same data in memory. Nevertheless, in 
Section VII we show how an attacker can reliably obtain two 
copies of S, demonstrating an end-to-end attack on OpenSSH. 

V. Memory Massaging 

The descriptions from Section IV assume that the attacker can 
place the victim’s secrets in the layout shown in Figure 3, 
where A0-A2 are allocated to the attacker, and that the 
attacker knows which bits can flip and in which direction. We 
now present novel memory massaging primitives that achieve 
both goals without requiring elevated permissions or special 
operating system configuration settings (i.e., avoiding huge 
pages, page map access, memory deduplication). 

A. Obtaining Physically Consecutive Pages 

As we can see in Figure 3, the attack requires pages located 
in three consecutive 8 KiB rows in the same bank. While 
this task was previously achieved using the Android ION 
allocator [61], no such interface is available in non-Android 
Linux. Instead, we exploit the Linux buddy allocator [17] to 
allocate a 2 MiB block of physically consecutive memory. As 
the same-row-index size (see Section II-A) on our system is 
256 KiB, we are guaranteed to be able to build the layout of 
Figure 3 using some of the pages in the block provided by 
the allocator. We now proceed to provide a short overview of 
Linux’s buddy allocator. See Gorman [17] for further details. 

Linux Buddy Allocator. Linux uses the buddy allocator to 
allocate physical memory upon requests from userspace. The 
kernel stores memory in physically consecutive blocks that 
are arranged by order, where the nth order block consists of 
$4096 \cdot 2^n$ physically consecutive bytes. The kernel maintains 
free lists for blocks of orders between 0 and 10. To reduce 
fragmentation, the buddy allocator always attempts to serve 
requests using the smallest available blocks. If no small block 
is available, the allocator splits the next smallest block into 
two “buddy” halves. These halves are coalesced into one block 
when they are both free again. 

The user space interface to the buddy allocator, however, 
can only make requests for blocks of order 0. If, for example, 
a user program requests 16 KiB, the buddy allocator treats 
this as four requests for one 4 KiB block each. This means 
that irrespective of their size, user space requests are first 
handled from the free list of order-0 blocks. Only once the 
allocator runs out of free order-0 blocks will it start serving 
memory requests by splitting larger blocks to generate new 
order-0 blocks. Thus, while obtaining a virtually consecutive 
2 MiB block is trivial and only requires a single memory 
allocation, obtaining a physically consecutive block requires 
a more careful strategy, which we now describe. 
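The block sizes and request counts mentioned above follow from simple arithmetic. The Python sketch below restates them; the constant and function names are ours, and the code only mirrors the description in the text rather than any kernel interface.

```python
PAGE_SIZE = 4096   # bytes in an order-0 block
MAX_ORDER = 10     # largest order tracked by the free lists

def block_bytes(order):
    """Size in bytes of a block of the given order: 4096 * 2**order."""
    return PAGE_SIZE << order

def order0_requests(nbytes):
    """Number of order-0 requests a user space allocation is broken into."""
    return -(-nbytes // PAGE_SIZE)  # ceiling division

print(block_bytes(9) // 2**20, "MiB for order 9")
print(block_bytes(MAX_ORDER) // 2**20, "MiB for order 10")
print(order0_requests(16 * 1024), "order-0 requests for 16 KiB")
print(order0_requests(2 * 2**20), "order-0 requests for 2 MiB")
```

A 2 MiB user space request is therefore served as 512 separate order-0 pages, which is why physical contiguity is not automatic.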

Obtaining a Physically Consecutive 2 MiB Block. We now 
exploit the deterministic behavior of the buddy allocator to 
coerce the kernel into providing us with physically consecutive 
memory, using the following steps: 

• Phase 1: Exhausting Small Blocks. First, we allocate 
memory using the mmap system call with the 
MAP_POPULATE flag, which ensures that the kernel eagerly 
allocates the pages in physical memory, instead of 
the default lazy strategy that waits for them to be accessed 
first. Next, we use the /proc/pagetypeinfo interface 
to monitor available block sizes in the kernel free lists, and 
continue to allocate memory until less than 2 MiB of free 
space remains in blocks of order less than 10. 

• Phase 2: Obtaining a Consecutive 2 MiB Block. Once 
free space in blocks of order below 10 is less than 2 MiB, 
we make two requests of size 2 MiB each. Thus, to serve the 
first request after exhausting the smaller blocks, the kernel 
needs to split one of the 10th order blocks (whose size 
is 4 MiB each). This leaves more than 2 MiB in the free 
list, where all such space comes from the newly-split 4 MiB 
block, and is served in-order. Thus, the memory allocated for 
the second request consists of consecutive physical memory 
blocks, which is exactly what we require. 
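The two phases can be illustrated with a toy model of the free lists. The Python sketch below is a simplification under stated assumptions: it tracks page frame numbers only, ignores per-CPU page lists, migrate types, and concurrent activity, and the starting state (100 scattered order-0 pages plus one order-10 block at frame 8192) is invented for the example.

```python
MAX_ORDER = 10

def alloc_page(free):
    """Serve one order-0 request from the smallest non-empty free list."""
    for order in range(MAX_ORDER + 1):
        if free[order]:
            base = free[order].pop()
            # Split down to order 0; each upper "buddy" half goes back
            # onto the free list of its own order.
            while order > 0:
                order -= 1
                free[order].append(base + (1 << order))
            return base
    raise MemoryError("no free blocks")

def alloc_pages(free, count):
    return [alloc_page(free) for _ in range(count)]

def demo():
    free = {order: [] for order in range(MAX_ORDER + 1)}
    free[0] = [5000 + 7 * k for k in range(100)]  # leftovers, less than 2 MiB
    free[10] = [8192]                             # one intact 4 MiB block
    first = alloc_pages(free, 512)   # drains the leftovers, then splits the block
    second = alloc_pages(free, 512)  # served entirely from the split block
    return first, second

first, second = demo()
print(second[0], second[-1], second == list(range(second[0], second[0] + 512)))
```

In this model the second request receives 512 consecutive frames that start in the middle of the 4 MiB block, so the region is contiguous but not 2 MiB aligned.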

While the region we obtain in the second allocation is physically 
consecutive, this approach does not guarantee that the 
obtained area will be 2 MiB-aligned in the physical memory. 
Thus, to use the obtained region for Rowhammer, we require 
an additional step to recover more information about the 
physical address of the obtained 2 MiB region.^ 

Recovering Physical Addressing Bits. Next, for double-sided 
hammering, we need to locate addresses in three consecutive 
rows within the same bank. As some of the physical 
address bits of the 2 MiB block are used for determining the 
banks of individual 4 KiB pages, we must somehow obtain 
these addressing bits for every 4 KiB page in our block. 

Since 2 MiB = $2^{21}$ bytes, and our 2 MiB block is physically 
sequential, obtaining the low 21 bits of the physical addresses 
amounts to finding the block’s offset from being 2 MiB aligned 
(where the low 21 bits are 0). In older Linux kernels, an 
attacker could use the pagemap interface to translate virtual 
addresses to physical addresses. However, in the current Linux 
kernel, the interface requires root privileges due to security 
concerns [55]. Instead of using the pagemap interface, we 
exploit the row-buffer timing channel of Pessl et al. [49] to 
recover the block offset. 

Computing Offsets. To find a block’s offset from a 2 MiB 
aligned address, we take advantage of the fact that our 2 MiB 
block is physically contiguous and that the set of distances 
between co-banked addresses uniquely defines the block’s 
offset. Figure 4 illustrates this concept. The blue block is a 
2 MiB aligned block originally found in the fragmented order 
10 block, while the red, 2 MiB unaligned block is the region 
we have obtained from our attack on the allocator. The colored 
vertical stripes are 4 KiB pages, where two pages of the same 
color indicate that they reside in the same bank. 

The distances $d_i$, $i \in \{0, 1, 2, \dots, n\}$, are the differences 
between the addresses of the i-th page in our block and the 
very next address located in the same bank. Together, the set 
$\{d_0, d_1, d_2, \dots, d_n\}$ forms a distance pattern for our block. 
There are 512 possible offsets for a 4 KiB page within a 2 MiB 
block; simulations of DRAM addressing confirm that these 
patterns uniquely identify the block’s offset. 
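To make the notion of a distance pattern concrete, the following Python sketch tabulates one pattern per candidate offset. The bank function used here is HYPOTHETICAL (five selector bits taken from address bits 13 to 17, some XORed with row bits 18 to 21) and merely stands in for the real mapping reverse engineered by Pessl et al. [49]; whether the resulting patterns are unique depends entirely on the true mapping, so the sketch shows the bookkeeping and not the paper's simulation.

```python
PAGE_SHIFT = 12
PAGES_PER_BLOCK = 512       # 2 MiB / 4 KiB
PAGES_PER_ROW_INDEX = 64    # 256 KiB same-row-index size / 4 KiB

def bank_of(page):
    """HYPOTHETICAL bank function; not the mapping of any specific CPU."""
    addr = page << PAGE_SHIFT
    bit = lambda k: (addr >> k) & 1
    return (bit(13) ^ bit(18), bit(14) ^ bit(19), bit(15) ^ bit(20),
            bit(16), bit(17) ^ bit(21))

def row_of(page):
    return page // PAGES_PER_ROW_INDEX

def distance(page):
    """Pages until the next co-banked page that lies in a different row."""
    for d in range(1, 2 * PAGES_PER_ROW_INDEX):
        other = page + d
        if bank_of(other) == bank_of(page) and row_of(other) != row_of(page):
            return d
    raise ValueError("no co-banked page in the search window")

def pattern(first_page, length=10):
    """Distances d_0 .. d_(length-1) for a block starting at first_page."""
    return tuple(distance(first_page + i) for i in range(length))

def patterns_by_offset(length=10):
    """One distance pattern per possible 4 KiB offset within 2 MiB."""
    return {off: pattern(off, length) for off in range(PAGES_PER_BLOCK)}

table = patterns_by_offset()
print(len(table), "offsets,", len(set(table.values())), "distinct patterns")
```

Given such a table for the real addressing function, a measured pattern could be matched against it to look up the block's offset.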

Recovering Distance Patterns. We can now use Pessl et 
al.’s [49] row-buffer timing side channel to find the distances 
$\{d_0, \dots, d_n\}$ between pages located in the same bank. Once 
we have uncovered enough of the distance pattern to uniquely 
identify a single offset, we have succeeded in computing the 
offset of our 2 MiB block. This typically occurs after finding 
fewer than ten distances. 

^The more naive strategy of first exhausting all smaller blocks and then 
using one larger request in the hope that it is served from a single large 
block tends not to work in practice. Any block of order 0 released during the 
exhaustion phase will be recycled before splitting the large block and will 
result in a non-consecutive allocation. 

Fig. 4: The blue block is the 2 MiB aligned block that was 
originally found in the fragmented order 10 block, while the 
red, 2 MiB unaligned block is the block we have obtained from 
our attack on the allocator. We compute the offset by finding 
the distances between co-banked pages $d_i$, $i \in \{0, 1, 2, \dots, n\}$, 
which uniquely identify the offset. 

We compute a distance $d_i$ by alternating read accesses 
between $p_i$ and $p_j$ for $j \in \{i+1, i+2, \dots, i+2n-2, i+2n-1\}$, 
where $p_i$ is the page at the i-th offset within the block, and 
n is the number of pages with the same row index. We then 
time how long it takes to access both addresses, and average 
the results over 8,000 trials; the page that corresponds to the 
greatest read time is identified as residing in the same bank as 
$p_i$. The distance $d_i$ is then equal to the difference in the page 
offset between the two. 
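The selection step, picking the candidate with the greatest average access time, can be written compactly. The Python sketch below assumes the timing samples have already been collected by some native measurement loop (not shown); the function name and the sample values are illustrative only.

```python
from statistics import mean

def co_banked_distance(i, latencies):
    """Return d_i given timing samples for candidate pages.

    `latencies` maps a candidate page index j to the list of access
    times measured while alternating reads between p_i and p_j.
    The slowest candidate on average is taken to share a bank with p_i,
    since a row-buffer conflict forces the open row to be replaced.
    """
    slowest = max(latencies, key=lambda j: mean(latencies[j]))
    return slowest - i

samples = {
    4: [100, 102, 98],
    5: [101, 99, 100],
    6: [150, 148, 152],   # row-buffer conflict: noticeably slower
    7: [100, 100, 101],
}
print(co_banked_distance(3, samples))
```

Averaging over many trials, as the text does with 8,000, makes the slow candidate stand out despite measurement noise.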

The reason we search over the next two rows of any bank 
(i.e., 512 KiB), and not just the next, is that the nature of 
the DRAM addressing scheme means that the two co-banked 
pages in consecutive rows can potentially lie anywhere within 
the memory range with the same row index. When we compute 
the distances, we make use of Schwarz’s [54] optimizations 
for confusing the memory controller to obtain accurate timing 
measurements. We empirically find over many trials that this 
method works with a 100% success rate. 

Recovering Bit 21. So far, we have uncovered bits 0-20 
of the physical address. As Pessl et al. [49] show, however, 
DRAM addressing on our system depends on bits 0-21. The 
naive solution is to simply adjust our attack on the memory 
allocator to obtain a physically contiguous 4 MiB block. This 
solution, however, is infeasible as the buddy allocator does not 
track 8 MiB blocks, and thus cannot split an 8 MiB block into 
two contiguous 4 MiB blocks. Another solution is to simply 
guess the value of bit 21, doubling the attack’s running time. 

We can, however, overcome this through an insight into the 
DRAM addressing scheme. On our system (a Haswell machine 
with two DIMMs on a single channel) there are three bank 
addressing bits used to select between the eight banks within 
a single rank. As specified by [49], bit 21 is only used for 
computing the third bank addressing bit by XORing bits 17 
and 21 of the physical address. Thus, to find two physical 
addresses $a^0$, $a^1$ located in the same bank in consecutive 8 KiB 
rows, we need to ensure that 

$$a^0_{17} \oplus a^0_{21} = a^1_{17} \oplus a^1_{21},$$ 

where $a^i_j$ is the j-th least significant bit of the i-th physical 
address. Now suppose that, given a physical address $a^0$ in the 
2 MiB block, we want to find another physical address $a^1$ 
in the same bank, but located in the row above. First, we set 
$a^1$ to be $a^0$ plus the size until the next row index. Then, we 
adjust $a^1_{17}$ to preserve the above equation. Even though we do 
not know $a^0_{21}$ nor $a^1_{21}$, we can examine bits 0 to 20 of $a^0$ to 
see if the addition of the row-index size done for computing 
$a^1$ resulted in a carry into bit 21. If so, we compensate by 
flipping $a^1_{17}$ in order to preserve the above equation. 
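The carry compensation can be sketched as follows. This Python example assumes the same-row-index size of 256 KiB stated earlier and handles only the relation between bits 17 and 21; the remaining bank bits, which depend on known address bits, are deliberately left out. The function names are ours.

```python
ROW_INDEX_SIZE = 1 << 18     # 256 KiB: distance to the next row index
LOW_MASK = (1 << 21) - 1     # the 21 address bits we have recovered

def next_row_low_bits(a0_low):
    """Bits 0-20 of a1, chosen so that bit17 XOR bit21 matches a0.

    Only bits 0-20 of a0 are known. If adding the row-index size
    carries into bit 21, bit 21 flips, so bit 17 is flipped as well.
    """
    total = a0_low + ROW_INDEX_SIZE
    a1_low = total & LOW_MASK
    if total >> 21:              # carry into the unknown bit 21
        a1_low ^= 1 << 17
    return a1_low

def third_bank_bit(addr):
    """Bit 17 XOR bit 21 of a full physical address."""
    return ((addr >> 17) ^ (addr >> 21)) & 1

# The invariant holds whichever value the unknown bit 21 actually has.
for bit21 in (0, 1):
    a0 = (bit21 << 21) | 0x1E3000
    a1 = ((a0 + ROW_INDEX_SIZE) & ~LOW_MASK) | next_row_low_bits(a0 & LOW_MASK)
    print(hex(a0), hex(a1), third_bank_bit(a0) == third_bank_bit(a1))
```

Because the carry flips bit 21 regardless of its starting value, flipping bit 17 alongside it keeps the XOR unchanged without ever learning bit 21.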

B. Memory Templating 

After obtaining blocks of contiguous memory, we proceed 
to search them for bits that can be flipped via Rowhammer. 
We refer to this as the templating phase, which is performed 
as follows. We first use our technique to obtain 2MiB blocks 
of physically contiguous memory. Then, we locate addresses 
that belong to the same bank using the method described 
above. Next, we perform double-sided hammering with both 
1-0-1 and 0-1-0 striped configurations. Finally, we record the 
locations of these flips for later use with RAMBleed. 
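Recording flips amounts to comparing a buffer's contents before and after hammering. The Python sketch below shows one way to list the differing bits; it assumes bits are numbered least-significant-first within each byte, which is our convention for the example and not something the text specifies.

```python
def find_flips(before, after):
    """Return (bit_index, old_bit, new_bit) for every bit that changed."""
    flips = []
    for byte_index, (old, new) in enumerate(zip(before, after)):
        diff = old ^ new
        for bit in range(8):
            if (diff >> bit) & 1:
                flips.append((byte_index * 8 + bit,
                              (old >> bit) & 1,
                              (new >> bit) & 1))
    return flips

page_before = bytes([0xFF] * 4)        # sampling data: all ones
page_after = bytearray(page_before)
page_after[2] = 0xFB                   # simulate one 1 -> 0 flip
print(find_flips(page_before, bytes(page_after)))
```

The recorded direction (1 to 0 or 0 to 1) indicates which striped pattern produced the flip and hence how the sampling page should be initialized later.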

C. Placing Secrets Near Flippable Bits 

After templating memory, we exploit the determinism of the 
Linux physical memory allocator to place the victim’s page in 
the desired physical locations as outlined in Figure 3. While a 
similar task was achieved in [61] on Android’s ION allocator 
by exhausting most of the available memory to control the 
placement of the victim, we achieve the same result on Linux’s 
buddy allocator without memory exhaustion. Following the 
convention of [51, 58, 61], we call this technique “Frame 
Feng Shui”, as we are coercing the allocator into placing select 
pages into a frame of our choosing. 

Exploiting Linux’s Buddy Allocator. The buddy allocator 
stores blocks of equal order in a first-in-last-out (FILO) 
stack-like data structure, and upon receipt of a request of order n, 
the allocator returns the most recently freed block from the 
n-th order’s bucket. Thus, if we assume that the victim, after 
being triggered, allocates a predictable number of pages before 
allocating the secret-containing page, we can force Linux’s 
memory allocator to place the victim’s secret-containing page 
in a page frame of our choice by the following: 

• Step 1: Dummy Allocations. The attacker allocates n 
4 KiB pages by calling mmap with the MAP_POPULATE 
flag, where n is the number of pages that the victim will 
allocate before allocating its secret-containing page. 

• Step 2: Deallocation. The attacker inspects her own 
address space and chooses the target page frame for the 
victim’s secret to land on (one that neighbors the flippable 
bits). Next, the attacker calls munmap and deallocates 
the selected frame. The attacker then immediately unmaps 
all the pages mapped during Step 1. After doing so, the 
allocator’s stack-like data structure for the 0th order blocks 
will have the n pages on top, followed by the target page. 

• Step 3: Triggering the Victim. After Steps 1 and 
2, the attacker immediately triggers the victim process, 
letting it perform its memory allocations. In Section VII, 
we accomplish this by initiating an SSH connection, which 
is served by the SSH daemon. After being triggered, the 
victim allocates n pages, which then land in the frames 
vacated by the pages mapped in Step 1. Finally, the victim 
allocates its secret-containing page, which then lands in the 
desired frame, as it will be located on top of the allocator’s 
stack-like data structure for 0th order blocks at this point. 
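The three steps can be traced on a toy stack of free frames. The Python sketch below models only the order-0 free list as a Python list used as a stack and assumes no other process allocates or frees pages in between; the frame numbers are arbitrary and the function name is ours.

```python
def frame_feng_shui(free_stack, target, n):
    """Simulate Steps 1-3 and return the frame that receives the secret."""
    # Step 1: the attacker allocates n dummy pages.
    dummies = [free_stack.pop() for _ in range(n)]
    # Step 2: free the target frame first, then the n dummy frames,
    # leaving the dummies on top of the stack and the target below them.
    free_stack.append(target)
    for frame in dummies:
        free_stack.append(frame)
    # Step 3: the victim's first n allocations consume the dummy frames...
    for _ in range(n):
        free_stack.pop()
    # ...and its next allocation, the secret-containing page, gets the target.
    return free_stack.pop()

free_frames = list(range(100, 120))
print(frame_feng_shui(free_frames, target=42, n=5))
```

With n = 0 the secret is the very next allocation, which corresponds to the proof-of-concept setting evaluated in Section VI.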

D. Putting It All Together 

With the above techniques in place, we can now describe 
our end-to-end attack, which consists of two phases. 

Offline. The attack starts by allocating 2 MiB blocks and 
dividing them into physically consecutive pages as described 
in Section V-A. The attacker then templates her blocks and 
locates Rowhammer-induced bit flips using the methodology 
described in Section V-B. Notice that this phase is done offline, 
entirely within the attacker’s address space, and without any 
interaction with the victim. Finally, after the attacker obtains 
enough Rowhammer-induced bit flips to read the victim’s 
secret, the attacker begins the online phase described below. 

Online. In this step, the attacker uses Frame Feng Shui to get 
the victim to place his secret in the physical memory locations 
desired by the attacker (e.g., using the layout in Figure 3). 
The attacker then performs the RAMBleed attack described in 
Section IV-C to exploit the data dependency with the victim’s 
bits, and subsequently deduces some of their values. Finally, 
the attacker repeats the online phase until a sufficient 
number of secret bits have been leaked from the victim (e.g., 
around 66% of the victim’s RSA secret key, which 
is sufficient to mathematically recover the remaining bits). 

VI. Experimental Evaluation 

To measure RAMBleed’s capacity as a read side channel, 
we measure the rate and accuracy of RAMBleed’s ability 
to extract bits across process boundaries and address spaces 
under ideal conditions and predictable victim behavior. 

Next, after evaluating both double-sided and single-sided 
RAMBleed, in Section VII we evaluate RAMBleed against 
an OpenSSH 7.9 server (which is a popular SSH server), 
extracting the server’s secret RSA signing keys. 

The Victim Process. In the proof-of-concept victim code, 
the victim waits for an incoming TCP connection, and then 
copies the secret key into a freshly allocated page (using an 
anonymous mmap) upon each TCP connection request. This 
behavior is akin to a server that runs a decryption routine every 
time the attacker makes a request, thereby using its secret key. 

The Attacker Process. The attacking process uses the 
techniques described in Section V-A to obtain 2 MiB physically 
consecutive blocks, and subsequently templates memory for 
flippable cells using the methods outlined in Section V-B. 
Finally, the attacker uses Frame Feng Shui to place the 
secret-containing page above and below a flippable bit (for 
single-sided, we only place it above). Concretely, we accomplish this 
by unmapping the target location and then initiating a TCP 
connection with the victim. Since n = 0 in this case, meaning 
that the secret is the first allocation upon context switching, 
the secret-containing page should land in the recently vacated 
frame. The attacker then hammers the surrounding rows and 
leaks the secret bits by reading out the flips from its own page. 
We run both processes under taskset with the same CPU affinity. 

                  Read Accuracy Percents 
Type              Overall    False Positive    False Negative 
Double-sided      90%        5%                15% 
Single-sided      74%        19%               29% 

TABLE I: “False positive” events, where a uniform configuration 
still flips, are rarer than “false negative” events, in 
which a striped configuration refuses to flip. 


Hardware. We use an HP Prodesk 600 desktop running 
Ubuntu 18.04, featuring an i5-4570 CPU and two Axiom 
DDR3 4 GiB 1333 MHz non-ECC DIMMs, model number 
51264Y3D3N13811, in a single-channel configuration. 

Experimental Results. While [13] report that bit flips are 
deterministic with regard to the surrounding bits (i.e., a bit 
flips if and only if it is in a striped configuration), on our 
systems we observe the more general case where the bit flips 
are probabilistic. Next, the probability of a bit flip highly 
depends on the type of configuration (striped or uniform). This 
uncertainty adds noise to our read channel, which we handle 
with a variant of the Heninger-Shacham technique [24]. 

Memory Templating. The time required to template memory 
and find the needed flips is entirely dependent upon how easily 
the underlying DIMMs yield bit flips. While [37] and [21] 
report finding thousands of flips within minutes, we found 
flips at a more modest rate of 41 flips per minute. 

Reading Secret Bits. After templating the memory with 
a striped 0-1-0 pattern, our experimental code can read out 
the victim’s secret at a rate of 3-4 bits/second. As we can 
see from the results in Table I, this works with 90% accuracy 
overall, and 95% accuracy when it comes to identifying 1-bits. 
This is because “false positive” events, that is, when a 1-1- 
1 uniform configuration still results in the center bit flipping 
from one to zero, are much rarer than “false negative” events, 
in which a 0-1-0 stripe refuses to flip. We can then template 
with the opposite stripe pattern (1-0-1) and achieve a 95% 
accuracy rate on the zero-valued bits. 

VII. Attacking OpenSSH 

To demonstrate the practical risk that RAMBleed poses to 
memory confidentiality, in this section we present an end-to- 
end attack against OpenSSH 7.9 that allows an unprivileged 
attacker to extract the server’s 2048-bit RSA private signing 
key. This key is what allows an SSH server to authenticate 
itself to incoming connections. As such, a break of this key 
enables the attacker to masquerade as the server, thereby 
allowing her to conduct man-in-the-middle (MITM) attacks 
and decrypt all traffic from the compromised sessions. 

At a high level, our attack operates by coercing the server’s 
SSH daemon to repeatedly allocate and place its private key 
material at vulnerable physical locations. We then use 
double-sided RAMBleed to recover a portion of the bits that make 
up the server’s RSA key. Finally, we utilize the mathematical 
redundancy in RSA keys to correct for errors in extracted 
bits, as well as recover missing bits that we were unable to 
read directly. Before describing our attacks, we now describe 
how OpenSSH manages and uses its keys in response to 
incoming SSH requests, and how we adapted the techniques 
from Section V to specifically target OpenSSH. 

A. Overview of OpenSSH 

The OpenSSH daemon is a root-level process that binds to 
port 22 and has access to a root-accessible file, which stores 
the server’s RSA private key. As shown in Figure 5, when a 
TCP connection arrives on port 22, the daemon spawns a child 
process that handles the authentication phase of incoming SSH 
connections. The child is responsible for both authenticating 
the server to the client as well as authenticating the client 
to the server. While the latter can be done either via a 
public-private key pair, or by supplying a password, the former is 
done by having the server use its RSA private key to sign a 
challenge issued by the client. Finally, once authentication is 
complete, the child process spawns an unprivileged grandchild 
for handling the user’s connection. See Figure 5. 

Key Memory Management. The child process that is 
spawned by the SSH daemon for mutually authenticating an 
incoming SSH request must first read in the server’s private 
key from the key file into a temporary buffer. At this point, 
the key will actually be located in memory in two places: 
namely, the temporary buffer and the OS’ page cache. 
Unfortunately, we cannot read either of these memory locations 
via RAMBleed. For the former, this buffer gets overwritten 
immediately, before we have any chance to read even a single 
bit using RAMBleed. The latter copy is also inaccessible as it 
is stored inside the OS’ page cache, which is located in a static 
region of physical memory that is not moved around. Luckily, 
OpenSSH’s authentication process then proceeds to copy the 
keys into a new buffer maintained by a global structure, 
aptly named “sensitive_data”. This buffer remains in physical 
memory for the duration of the connection. Thus, our attack 
aims to read the private key material from this structure. 

We now proceed to describe our attack on OpenSSH. 

B. Attack Overview 

Our first step is to profile memory, looking for flippable 
bits. We do this in the same manner described in Section V-B. 
After finding a sufficient number of flips, we begin the reading 
phase, in which we perform RAMBleed to leak a single bit at 
a time. At a high level, for each templated bit, we use Frame 
Feng Shui to place private key material in the configuration 
shown in Figure 3, where A1 is the page containing the 
flippable bit. We then perform double-sided RAMBleed to leak 
the bit’s value and proceed to the next bit. 

Fig. 5: Overview of our attack on OpenSSH. 
(a) The attacker initially owns both target pages T0 and T1. 
(b) The attacker makes an SSH connection and performs Frame Feng 
Shui to land the secret S in the target page T0, which lies above the 
sampling page (A1). 
(c) The attacker repeats the Frame Feng Shui process to land S in 
the target page T1, below the sampling page (A1). 
(d) After achieving the double-sided RAMBleed position, the attacker 
now hammers the activation pages (A0 and A2) to induce flips in the 
sampling page (A1). 

C. Overcoming OpenSSH's Memory Allocation Pattern 

To use Frame Feng Shui against OpenSSH, we must 
determine the value n, which is the number of pages we 
must unmap after vacating the target frame in order to cause 
OpenSSH to place the secret in the targeted frame location. 
Examining the behavior of OpenSSH 7.9 on our system, we 
found that its allocation pattern is predictable, which allows 
us to use Frame Feng Shui with a high success rate. More 
specifically, we found that OpenSSH uses the default RSA 
key size of 2048 bits, with the following allocation pattern. 

• First, the page containing d, the RSA private exponent, is 
allocated 101 pages after the daemon accepts a new TCP 
connection. See Section II-D for RSA notation. 

• Next, a single page containing both p and q is allocated 102 
pages after the daemon accepts a new connection. 

• Finally, a single page that contains both dp and dq is 
allocated 104 pages after accepting a new connection. 

Furthermore, all the private key values mentioned above are 
located at the same offset within their page upon every incoming 
connection. Thus, we fix n = 100, 101, and 104 respectively 
for d, p and q, and dp, dq. Next, to obtain the configuration in 
Figure 3, we call munmap on the page above A1 and follow 
it with n munmaps on random pages. We then immediately 
make a TCP connection, causing the SSH daemon to make n 
allocations, followed by allocating the secret-containing page, 
which will then be placed in the target frame. By holding the 
TCP connection open, we can repeat the process to place the 
page in the frame below A1, thereby creating two copies of 
the secret in memory to facilitate double-sided RAMBleed. 

Accounting for Allocation Noise. The memory placement 
technique described above is much more susceptible to noise, 
as many CPU cycles pass between the point of the original 
unmapping by the attacker and when the victim maps the 
key-containing page. Thus, if any pages are allocated or 
deallocated in that time frame by another process, the 
key-containing pages will not be placed in the desired locations. 
To minimize this noise, the attacker yields to the scheduler before 
performing the page deallocations, allowing other scheduled 
system activity to execute. Next, we also use a busy loop after 
unmapping the pages and before reading the bits, waiting a 
fixed amount of time for OpenSSH to perform the required 
allocations. We note here that if we replace the busy loop 
with a sleep operation, this will likely cause the system to 
schedule another process and destroy the memory layout. After 
using RAMBleed to read the bit(s), we close the connections, 
triggering the daemon to kill the two children. 
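The ordering of the noise-reduction measures (yield, unmap, trigger, busy-wait) can be sketched as below. This Python outline is an assumption-laden illustration: the unmap and trigger actions are passed in as placeholder callables, the spin duration is arbitrary, and an actual attack would be written in native code where scheduling effects are easier to control.

```python
import os
import time

def run_placement(unmap_pages, trigger_victim, spin_seconds):
    """Yield first, then unmap, trigger the victim, and spin without sleeping."""
    if hasattr(os, "sched_yield"):
        os.sched_yield()              # let pending system activity run now
    unmap_pages()                     # vacate the target frame and dummy pages
    trigger_victim()                  # e.g., open the TCP connection
    deadline = time.perf_counter() + spin_seconds
    while time.perf_counter() < deadline:
        pass                          # busy loop: stay on the CPU, do not sleep

events = []
run_placement(lambda: events.append("unmap"),
              lambda: events.append("trigger"),
              spin_seconds=0.01)
print(events)
```

Spinning keeps the attacker runnable for the whole interval, whereas a sleep would invite the scheduler to run another process that may disturb the free lists.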

After mitigating noise in this manner, the memory 
placement process succeeds against OpenSSH with 83% probability. 
This means that we will be in the double-sided RAMBleed 
situation $0.83^2 = 68.89\%$ of the time, in single-sided RAMBleed 
$2 \cdot 0.83 \cdot 0.17 = 28.22\%$ of the time, and $0.17^2 = 2.89\%$ of 
the time we will be unable to place the target page near the 
flipping row, resulting in random guessing. This, along with 
potential for RAMBleed to misread bits, gives us an overall 
accuracy of 82% when reading the OpenSSH host key. 
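The three cases follow from treating the two placements (above and below the sampling page) as independent trials with the same success probability. The short Python sketch below reproduces that arithmetic; independence is the assumption implied by the text, and the function name is ours.

```python
def placement_probabilities(p_success):
    """Probabilities of landing zero, one, or two copies of the secret."""
    p_fail = 1.0 - p_success
    return {
        "double-sided": p_success ** 2,          # both placements succeed
        "single-sided": 2 * p_success * p_fail,  # exactly one succeeds
        "unable to place": p_fail ** 2,          # both fail
    }

for case, prob in placement_probabilities(0.83).items():
    print(f"{case}: {prob:.2%}")
```

The three probabilities sum to one, which provides a quick consistency check on the reported percentages.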

Type                        Probability
Double-sided RAMBleed       68.89%
Single-sided RAMBleed       28.22%
Unable to place victim      2.89%

TABLE II: Probability of OpenSSH placing pages containing
private key material into double-sided, single-sided, or
unable-to-place situations.

Key Recovery. To recover the key from the noisy bits, we
use a variant of Paterson et al.’s [46] algorithm, an adaptation
of the Heninger-Shacham algorithm [24] for the case that key
bits are only known with some probability. Specifically, the
algorithm aims to reconstruct the key, bit by bit, starting from
the least significant bit. By relating the public (N, e) and
private (d, p, q, dp, and dq) key components, the algorithm
prunes potential keys and dramatically reduces the search
space. The algorithm explores a search tree of potential keys
while pruning branches that contradict known bits or have a
large number of mismatches with probabilistically recovered
bits. Our approach is similar to that of Paterson et al. [46],
but uses a depth-first search in place of a breadth-first search.
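To make the branch-and-prune idea concrete, the following toy
sketch reconstructs p and q from noisy observations using a
depth-first search. It is deliberately simplified and is not the
amended algorithm used in the attack: it relies only on the
relation N = p · q (ignoring d, dp, and dq), and the observation
dictionaries and the mismatch threshold are hypothetical.

```python
def mismatches(observed, index, bit):
    """Return 1 if a noisy observation exists for this bit and disagrees."""
    seen = observed.get(index)
    return int(seen is not None and seen != bit)


def recover_factors(n, p_obs, q_obs, nbits, max_mismatches):
    """Depth-first branch-and-prune over the low bits of p and q.

    p_obs / q_obs map bit index -> observed (possibly wrong) bit value.
    """
    # p and q are odd primes, so bit 0 of each is 1.
    start = mismatches(p_obs, 0, 1) + mismatches(q_obs, 0, 1)
    stack = [(1, 1, 1, start)]  # (p so far, q so far, next bit, errors)
    while stack:
        p, q, i, errors = stack.pop()
        if errors > max_mismatches:
            continue  # prune: too many disagreements with observations
        if i == nbits:
            if p * q == n:
                return p, q
            continue
        mask = (1 << (i + 1)) - 1
        for p_bit in (0, 1):
            for q_bit in (0, 1):
                cand_p = p | (p_bit << i)
                cand_q = q | (q_bit << i)
                # prune: the low i+1 bits of p*q must agree with N
                if (cand_p * cand_q) & mask != n & mask:
                    continue
                cost = (errors + mismatches(p_obs, i, p_bit)
                        + mismatches(q_obs, i, q_bit))
                stack.append((cand_p, cand_q, i + 1, cost))
    return None


# Toy 8-bit example: p = 251, q = 241, with one wrong observation for q.
n = 251 * 241
p_obs = {0: 1, 1: 1, 3: 1, 4: 1, 6: 1, 7: 1}
q_obs = {0: 1, 1: 0, 2: 1, 4: 1, 5: 1, 7: 1}  # bit 2 is really 0
result = recover_factors(n, p_obs, q_obs, nbits=8, max_mismatches=2)
print(result)
```

Using a stack gives the depth-first order: a promising branch is
followed to full length before its siblings are expanded, which
keeps memory use small compared with a breadth-first frontier.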

Through a series of simulations on random 2048-bit RSA
keys, we empirically found that our amended Heninger-
Shacham algorithm requires recovery of 68% of the private
key material (d, p, q, dp, dq) at 82% accuracy. This implies
that 4200 distinct bits of private key material are sufficient to
extract the complete key.

D. Overall Attack Performance 

Memory Templating. We begin our attack by locating the
flippable bits in the memory of the target machine. Using the
techniques presented in Sections IV and V, we profiled the
machine’s memory to locate Rowhammer-induced bit flips. We
note here that the time required to template memory and find
the required flips is entirely dependent upon the susceptibility
of the underlying DIMMs to Rowhammer attacks. While [21, 37]
report finding thousands of flips within minutes, we found
flips at a more modest rate of 41 flips per minute, giving us a
running time of 34 hours to locate the 84K bit flips required 
for the next phase of the attack.^ 

We note here that this phase can be performed ahead of time 
and with user level permissions, without the need to interact 
with the victim application or its secrets. 

Removing Useless Bits. Next, we note that not all of these
bit flips are useful for key extraction. First, given OpenSSH’s
memory layout and the location of the key elements in their
respective pages, only a 3/16 fraction of the bits
(corresponding to offsets of d, p, q, dp and dq) are useful for
key recovery. Out of the 84K bit flips recovered in the previous
phase, this leaves approximately 15750 bit flips which have
the potential to reveal bits of the secret key. Next, we note
that these bit flips also contain repetitions in their locations in
the page, meaning that two or more bit flips might actually
correspond to the same bit of the secret key. After removing
such duplicates, we are left with 4.2K bit flips in distinct
locations that are useful for key extraction.

Reading Private Key Material. After placing the key-
containing pages in the desired locations to achieve one of the
RAMBleed configurations, we then proceed to hammer A0 and
A2 (see Figure 3). We have no way of determining if we are in
the double-sided, single-sided, or unable-to-place RAMBleed
situation, but given the probabilities in Section VII-C, it is
likely that the bit flip in A1 will depend upon the secret bit
values. Overall, this process resulted in recovering 68% of the
private key, or 4200 key bits, at a rate of 0.31 bits/second at
an accuracy rate of 82% against OpenSSH. We conjecture that
the decreased accuracy is due to the combined noise from both
the inaccuracy of RAMBleed and Frame Feng Shui.

^We empirically found that 84K bit flips was approximately the threshold
for locating 4200 usable, unique, flippable bits.

Key Recovery. As mentioned above, we recover 68% of the 
key bits with 82% accuracy. Using our amended Heninger- 
Shacham algorithm, we recover the entire RSA private key in 
about 3 minutes on a consumer laptop (Dell XPS 15 featuring 
an Intel i7-6700 3.4 GHz CPU and 32 GiB of RAM).
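The figures quoted in this subsection can be cross-checked with a
few lines of arithmetic. The templating time, the fraction of
useful flips, and the bit counts come from the text; the total
reading time is not stated in the text and is derived here from
the quoted rate, so it should be read as an estimate.

```python
from fractions import Fraction

flips_needed = 84_000        # bit flips located during templating
flips_per_minute = 41        # observed templating rate
templating_hours = flips_needed / flips_per_minute / 60

useful_flips = 15_750        # flips at offsets holding key material
useful_fraction = Fraction(useful_flips, flips_needed)

key_bits = 4_200             # distinct key bits read
read_rate = 0.31             # bits per second
reading_hours = key_bits / read_rate / 3600  # derived estimate

print(f"templating: {templating_hours:.1f} h")
print(f"useful fraction: {useful_fraction}")
print(f"reading: {reading_hours:.2f} h")
```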

VIII. RAMBleed on ECC Memory

In this section we show how to use RAMBleed to read 
secret information stored on DIMMs that use ECC memory. 
Unlike Section IV, which shows how RAMBleed can exploit 
visible bit flips to read secret information, here we show how 
an attacker can exploit bit flips that were successfully corrected 
by ECC to read information from the victim’s address space. 

We begin by providing background on ECC memory. 

A. ECC Memory Background 

Memory manufacturers originally designed ECC memory
for correcting rare, spontaneous bit flips, such as those caused
by cosmic rays. As such, ECC memory uses error correcting
codes that can only correct a small number of bits in a single
code word, typically only one or two. This is commonly known
as SECDED (single error correction, double error detection).

Correction Mechanism. When an ECC-enabled system
writes data to DRAM, the memory controller writes both the
data bits and an additional string of bits, called the check bits.
These bits offer the redundancy that enables detection and
correction of errors. Together, the data and check bits make up
a codeword, where the typical sizes for data and check bits are
64 and 8 bits, respectively. Upon serving a read request from
DRAM, the memory controller reads both the data and check
bits, and checks for errors. If an uncorrectable error is detected,
the controller typically crashes the machine, rather than letting
the software operate on corrupted data. Alternatively, if the
error can be corrected, the memory controller first corrects the
error, and only then passes the corrected value to the software.
We note that ECC correction and detection occur only during
read requests, and that a bit flip will go undetected until a
codeword is read from the DIMM.
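As an illustration of how 8 check bits can protect 64 data bits,
the sketch below implements a textbook extended Hamming code: 7
position-parity bits plus one overall parity bit. This is an
assumption chosen for clarity; the code actually implemented by a
given memory controller is vendor-specific and may differ.

```python
def secded_encode(data):
    """Encode a list of bits; index 0 holds the overall parity bit."""
    r = 0
    while (1 << r) < len(data) + r + 1:
        r += 1                         # number of Hamming parity bits
    n = len(data) + r
    word = [0] * (n + 1)
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):            # not a power of two: data position
            word[pos] = next(bits)
    syndrome = 0
    for pos in range(1, n + 1):
        if word[pos]:
            syndrome ^= pos
    for k in range(r):                 # choose parity bits so syndrome is 0
        word[1 << k] = (syndrome >> k) & 1
    word[0] = sum(word) & 1            # overall parity makes the weight even
    return word


def secded_decode(word):
    """Return (data, status): 'clean', 'corrected', or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, len(word)):
        if word[pos]:
            syndrome ^= pos
    odd_parity = sum(word) & 1
    if odd_parity:                     # odd number of flips: assume one
        if syndrome >= len(word):
            return None, "uncorrectable"
        word[syndrome] ^= 1            # syndrome 0 means the parity bit itself
        status = "corrected"
    elif syndrome:                     # even parity, nonzero syndrome: two flips
        return None, "uncorrectable"
    else:
        status = "clean"
    data = [word[pos] for pos in range(1, len(word)) if pos & (pos - 1)]
    return data, status


data = [(i * i + i // 3) % 2 for i in range(64)]
codeword = secded_encode(data)
print(len(codeword))                   # 64 data bits + 8 check bits
```

Note that decoding happens only when the codeword is read, which
mirrors the point above: a flipped bit sits unnoticed in DRAM
until the next read of that codeword.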

Detecting Bit Flips. As Cojocar et al. [13] describe, this 
synchronous error correction results in a timing side channel 
that allows an attacker to determine if a single-bit error has 
occurred. They found that the overhead incurred by correctable 
bit flips is on the order of hundreds of thousands of cycles, 
which the attacker can easily measure. 

Concretely, we can detect the presence of a bit flip in
any given word by measuring the read latency from the
word. When we read from a word with a single-bit error, the
hardware must first complete the ECC algorithm, and often log
the error in the firmware log, before the value from the read is
returned. If we observe a much longer read latency, it indicates
that a bit flip occurred sometime after the last time that the
same 64-bit word was read. This effect is illustrated in
Figure 6; after performing double-sided hammering on the two
aggressor rows, we read from the victim row and observe a
crisp peak for the 186th word, clearly indicating a bit flip.

Fig. 6: Read latencies for the 64-bit words in a single
page. When ECC corrects an error, the latency is 5 orders
of magnitude greater than the common case. This can be seen
by the peak for the 186th word, which indicates a bit flip.
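Given per-word read latencies for a page, picking out the words
whose reads triggered a correction reduces to outlier detection.
The sketch below is one simple way to do that, assuming the
latencies have already been measured (for example with a cycle
counter in native code); the threshold factor and the synthetic
numbers are hypothetical and are not measurements from our setup.

```python
from statistics import median


def words_with_corrected_flips(latencies, factor=1000):
    """Return indices whose latency exceeds `factor` times the median."""
    baseline = median(latencies)
    return [i for i, cycles in enumerate(latencies)
            if cycles > factor * baseline]


# Synthetic page of 512 64-bit words with one slow read at index 186.
latencies = [200] * 512
latencies[186] = 200 * 100_000
print(words_with_corrected_flips(latencies))
```

Because the gap between corrected and ordinary reads spans
several orders of magnitude, the exact threshold is not
critical.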

B. RAMBleed on ECC Memory 

We now show how we can leverage the ability to detect the 
presence of corrected bit flips to read information from the 
victim’s address space. To the best of our knowledge, this is 
the first demonstration of security implications of corrected bit 
flips. 

Experimental Setup. Following the Intel-1 setup of
Cojocar et al. [13], we demonstrate the RAMBleed attack on
ECC memory on a Supermicro X10SLL-F motherboard (BIOS
version 3.0a) equipped with an Intel Xeon E3-1270 v3 CPU
and a pair of Kingston 8GB 1333 MHz ECC DIMMs,
model number KVR1333D3E9SK2.

Templating. As with the non-ECC attack, we begin by first
templating memory to locate bit flips. We do so in much the
same manner as [13], only with an algorithmic improvement
for determining which bit in a row is the flippable bit.

Cojocar et al. [13] locate bit flips by performing double-
sided Rowhammer, and then using the timing side channel to
locate a word containing a bit flip. They determine which of
the 64 bits flipped by setting exactly one of the bits to its
charged state, while all the rest are discharged. This results in
the targeted bit being in the middle of a striped configuration,
while all the other bits in the word are part of a uniform
configuration. Next, a long read latency indicates that the
single charged bit flipped. Finally, they repeat the process for
each bit to determine which bits can be flipped.

To speed up the process of templating memory for bit 
flips, we replace the single-bit iteration phase with a binary 
search over the possible locations for the bit flip. That is, 
after locating a word with a bit flip, we set half of the bits 
to their charged state, with the other half discharged. We 
then hammer the aggressor rows again, and record the read 
latency. If it is long, then the bit flip lies in the half with the 
charged bits; otherwise, it lies in the other half. We repeatedly 
reduce the search space by half in this manner, until we have 
pinpointed the location of the bit flip. Overall this speeds up 
the templating phase of Cojocar et al. [13] by a factor of 10. 
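The binary search can be sketched against a simulated word. The
`word_flips_with` oracle below is a hypothetical stand-in for
"write this charge pattern, hammer the aggressor rows, and report
whether the timed read was slow"; the sketch also assumes that
the word contains exactly one flippable bit and that the flip is
reproducible.

```python
def locate_flippable_bit(word_flips_with, width=64):
    """Binary-search for the flippable bit; returns (index, hammer rounds)."""
    lo, hi = 0, width            # candidate bit positions are [lo, hi)
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        rounds += 1
        # Charge only the lower half of the remaining candidates.
        if word_flips_with(set(range(lo, mid))):
            hi = mid             # slow read: the flip is among charged bits
        else:
            lo = mid             # fast read: the flip is in the other half
    return lo, rounds


def make_simulated_word(flippable_index):
    """Simulated oracle: a slow read occurs iff the flippable bit is charged."""
    def oracle(charged_bits):
        return flippable_index in charged_bits
    return oracle


index, rounds = locate_flippable_bit(make_simulated_word(37))
print(index, rounds)
```

For a 64-bit word this takes 6 hammering rounds instead of up to
64 for the one-bit-at-a-time approach, which is in line with the
roughly tenfold speedup reported above.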

Reading Bits. After profiling memory and recording the
precise locations of flippable bits, we use the memory massaging
and Frame Feng Shui techniques described in Section V to
achieve the double-sided RAMBleed configuration. In the non-
ECC RAMBleed case, we hammered the aggressor rows and
subsequently directly read the victim row for a Rowhammer-
induced bit flip, thereby leaking values of secret bits. With
ECC, we cannot observe the flips directly. Instead we use the
timing side channel and look for long read latencies. As such
latencies occur only due to Rowhammer-induced flips, they
can be used to reveal the value of the secret bit as described
in Section IV.

Experimental Results. We can successfully read bits via
RAMBleed against ECC memory with 73% accuracy at
a reading rate of 0.64 bits/second in our setup. Since ECC
DIMMs are typically built using the same chips as used on
non-ECC DIMMs, but with an additional chip for storing the
check bits, we attribute the drop in accuracy to the fact that
they are simply different sets of DIMMs.

IX. Mitigations 

Unlike previous Rowhammer attacks, which compromise
integrity, RAMBleed is an attack that compromises
confidentiality. Moreover, to leak information across processes
and address spaces, RAMBleed only requires that the attacker
can read and hammer her own private memory, and does not
involve any access or modification to the target’s data, code,
or address space. As such, RAMBleed can bypass software-
based integrity checks that might be applied to the target,
such as using message authentication codes (MAC) to protect
the target’s data. Moreover, techniques designed to protect
cryptographic systems against fault attacks (such as Shamir’s
countermeasure [56]) are also ineffective, as they again protect
the integrity of the cryptographic computation and not its
confidentiality. Other software defenses, such as Brasser et
al.’s [8] memory partitioning scheme, do not mitigate our
attack, as we are not trying to read from kernel memory.

A. Hardware Mitigations 

There are, however, a few commonly proposed hardware-
based mitigations that have the potential to mitigate
RAMBleed. Kim et al. [34] propose PARA (probabilistic adjacent
row activation), wherein activating a row causes nearby rows
to activate with some probability. Repeated hammering of an
address then increases the likelihood that nearby victim rows
will be refreshed, thereby restoring their cells’ charges and
preventing Rowhammer. PARA has not been widely adopted,
as it can only provide a probabilistic security guarantee.
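The probabilistic nature of PARA's guarantee can be quantified
with a short calculation. If each activation of an aggressor row
independently refreshes a neighboring row with probability p, the
chance that the neighbor is never refreshed over N activations is
(1 - p)^N. The values of p and N below are hypothetical and are
chosen only to show the shape of the trade-off.

```python
def prob_victim_never_refreshed(p_refresh, activations):
    """Chance that no activation triggers a neighbor refresh.

    Assumes each activation is an independent trial with
    probability `p_refresh` of refreshing the victim row.
    """
    return (1.0 - p_refresh) ** activations


# Hypothetical parameters: p = 0.001 and 100,000 hammering activations.
risk = prob_victim_never_refreshed(0.001, 100_000)
print(f"{risk:.3e}")
```

The residual probability shrinks exponentially in N, but it never
reaches zero, which is why the guarantee remains probabilistic.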

Targeted Row Refresh (TRR). The more recent LPDDR4
standard supports the ability to refresh a targeted row with
TRR, where after a row is accessed a set number of times,
the nearby rows are automatically refreshed [31]. Despite
this mitigation, [21, 61] already report the ability to induce
Rowhammer bit flips in the presence of TRR.

Increasing Refresh Rates. Doubling the DRAM refresh
rate by halving the refresh interval from 64ms to 32ms is an
attempt at reducing the number of bit flips by refreshing victim
rows more often. However, this is impractical on mobile
systems due to the increased power demands. Worse yet, Aweke
et al. [2] and Gruss et al. [21] demonstrate bit flips even under
this setting.

Using Error Correcting Codes (ECC). An oft-touted
panacea for Rowhammer is the usage of ECC memory, as
any bit flip will simply be corrected by the hardware without
affecting the software layer. However, as we show in
Section VIII, the hardware error correction implementation
actually produces sufficient side channel information for
mounting RAMBleed. Thus, while ECC significantly slows
RAMBleed, it does not offer complete protection.

B. Memory Encryption 

One defense that does in fact protect against RAMBleed
is memory encryption. This is because RAMBleed reads bits
directly from memory, which are ciphertext bits in the case that
memory is encrypted. Trusted execution environments, such
as Intel’s Software Guard Extensions (SGX), ARM’s
TrustZone, and AMD’s Secure Encrypted Virtualization (SEV), in
fact fully encrypt the enclave’s memory, thereby protecting
it from RAMBleed. It should be noted, however, that
some enclaves, such as SGX, perform integrity checking on
encrypted memory; Jang et al. [28] and Gruss et al. [21] show
that Rowhammer-induced flips in enclave memory halt the
entire machine, necessitating a power cycle.

C. Flushing Keys from Memory 

For systems that use sensitive data for a short amount of
time (e.g., cryptographic keys), zeroing out the data
immediately after use [22] would significantly reduce the risk
from RAMBleed. This is because RAMBleed cannot accurately
read bits of keys that do not remain in memory for at least one
refresh interval (64ms by default). While this countermeasure
is effective for protecting short-lived data, it cannot be used
for data that needs to stay in memory for long durations.
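The zero-after-use pattern can be sketched as a small helper that
overwrites a mutable buffer as soon as the caller is done with
it. This is an illustration only: the helper name is
hypothetical, and in a managed language such as Python other
copies of the secret (including the original input object) may
survive elsewhere in memory, so production code would typically
perform the wipe in native code.

```python
from contextlib import contextmanager


@contextmanager
def short_lived_secret(data):
    """Yield a mutable copy of `data` and zero it when the block exits."""
    buf = bytearray(data)
    try:
        yield buf
    finally:
        # Overwrite in place so this buffer no longer holds the secret.
        for i in range(len(buf)):
            buf[i] = 0


with short_lived_secret(b"key material") as key:
    in_use = bytes(key)      # the secret is available inside the block
    handle = key             # keep a reference to inspect it afterwards

print(handle)                # all zero bytes once the block has exited
```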

D. Probabilistic Memory Allocator 

Our Frame Feng Shui technique exploits the deterministic
behavior of the Linux buddy allocator to place the victim’s
pages in specific locations. Consequently, introducing
a sufficient amount of non-determinism into the allocation
algorithm will prevent the attacker from placing secrets into
vulnerable locations. Such a defense would not, however,
necessarily defeat RAMBleed attacks that use probabilistic
memory spraying techniques similar to [55]. The attacker
could potentially keep many SSH connections open at once,
and then hammer and read from the locations with the correct
RAMBleed configurations. The attacker could use the row-
buffer timing side channel to detect the correct configurations.

X. Limitations and Future Work

RAMBleed’s primary limitation is that it requires the victim
process to allocate memory for its secret in a predictable
manner in order to reliably read bits of interest. Otherwise,
the Frame Feng Shui process described in Section V-C will
not place the secret page in the intended frame. It may be
possible, however, to bypass this limitation by using Yarom
and Falkner’s [65] Flush+Reload technique to determine
when the secret page is about to be allocated.

Another limitation is that our attack against OpenSSH 7.9
required the daemon to allocate the key multiple times. We
conjecture, however, that it may be possible to read secrets
even when they are never reallocated by the victim. If the
secret lies in the page cache, it is likely possible to use
Gruss et al.’s [21] memory waylaying technique to repeatedly
evict the secret and then bring it back into memory, thereby
changing its physical address. Even if it does not lie in the page
cache, the attacking process can still evict it by exhausting
enough memory to start paging memory to disk. Both of
these strategies would, however, be defeated by using Linux’s
mlock system call to lock secret pages into memory, thereby
preventing them from ever being evicted to disk.

Next, while we demonstrated our attack on a system using
DDR3 DRAM, we do not suspect DDR4 to be a fundamental
limitation, assuming that DDR4 memory retains the property
that Rowhammer-induced bit flips are data-dependent. Our
techniques for recovering physically sequential blocks depend
only on the operating system’s memory allocation algorithm,
and are thus hardware agnostic. With regard to finding pairs of
addresses in different rows of the same bank, [49] have already
demonstrated how to reverse engineer the DRAM addressing
scheme in DDR4 systems. Furthermore, Rowhammer-induced
bit flips in DDR4 have been demonstrated by [1, 21, 37]. We
leave the composition of these results to achieve RAMBleed
on DDR4 memory to future work.

Finally, RAMBleed’s rate of reading memory is modest,
topping out at around 3-4 bits per second. This allows sufficient
time for memory scrubbing countermeasures to remove
short-lived secret data from the target’s memory. We thus leave
the task of improving RAMBleed’s read rate to future work.

XI. Conclusion

In this paper, we have shifted Rowhammer from being a
threat only to integrity to also being a threat to confidentiality.
We demonstrated the practical severity of RAMBleed by
conducting an end-to-end exploit against OpenSSH 7.9, in which
we extracted the complete 2048-bit RSA private signing key.
To do so, we also developed memory massaging methods and
a technique called Frame Feng Shui that allows an attacker to
place the victim’s secret-containing pages in chosen physical
frames. By uncovering another channel for Rowhammer-based
exploitation, we have highlighted the need to further explore
and understand the complete capabilities of Rowhammer.